Abstract

A toy detector has been designed to simulate the central detectors of reactor
neutrino experiments. Electron samples from the Monte-Carlo simulation of the
toy detector have been reconstructed with Bayesian neural networks (BNN) and
with the standard algorithm, a maximum likelihood method (MLD). The results of
the event reconstruction using BNN have been compared with those using MLD.
Compared to MLD, the uncertainties of the electron vertex are not improved, but
the energy resolutions are significantly improved using BNN, and the
improvement is more pronounced for high-energy electrons than for low-energy
ones.
1 Introduction

The main goals of reactor neutrino experiments are to detect
$\bar{\nu}_e \to \bar{\nu}_x$ oscillation and to measure precisely the neutrino
mixing angle $\theta_{13}$. The experiments are designed to detect reactor
$\bar{\nu}_e$'s via the inverse $\beta$-decay reaction

$$\bar{\nu}_e + p \to e^+ + n$$

The signature is a delayed coincidence between the $e^+$ signal and the
neutron capture signal. It is therefore very important to reconstruct the
energy and the vertex of each signal detected in the experiments. The standard
algorithm for event reconstruction in these experiments is a maximum likelihood
method (MLD from now on). However, the method of Bayesian neural networks (BNN
from now on) [1] is more suitable than MLD for the event reconstruction of
reactor neutrino experiments. BNN is a neural network algorithm trained with
Bayesian statistics. It is not only a non-linear function, but also controls
model complexity. Its flexibility makes it possible to discover more general
relationships in data than traditional statistical methods, and its preference
for simple models lets it handle the over-fitting problem better than ordinary
neural networks [2]. In this paper, BNN is applied to the event reconstruction
of electron samples from the Monte-Carlo simulation of a toy detector for
reactor neutrino experiments. The result of the event reconstruction using BNN
is compared with the one obtained using MLD.

2 The Regression with Bayesian Neural Networks [1, 3]

The idea of BNN is to regard the process of training a neural network as
Bayesian inference. Bayes' theorem is used to assign a posterior density to
each point, $\theta$, in the parameter space of the neural networks. Each point
$\theta$ denotes a neural network. In the method of BNN, one performs a
weighted average over all points in the parameter space, that is, over all
neural networks. The method makes use of training data
$\{(x_1,t_1), (x_2,t_2), \ldots, (x_n,t_n)\}$, where $t_i$ is the known target
value associated with the data $x_i$, which has $P$ components if there are $P$
input values in the regression. That is, the set of data
$x = (x_1, x_2, \ldots, x_n)$ corresponds to the set of targets
$t = (t_1, t_2, \ldots, t_n)$. The posterior density assigned to the point
$\theta$, that is, to a neural network, is given by Bayes' theorem



$$p(\theta|x,t) = \frac{p(x,t|\theta)\,p(\theta)}{p(x,t)} = \frac{p(t|x,\theta)\,p(x|\theta)\,p(\theta)}{p(t|x)\,p(x)} = \frac{p(t|x,\theta)\,p(\theta)}{p(t|x)} \qquad (1)$$

where the data $x$ do not depend on $\theta$, so $p(x|\theta) = p(x)$. We need
the likelihood $p(t|x,\theta)$ and the prior density $p(\theta)$ in order to
assign the posterior density $p(\theta|x,t)$ to the neural network defined by
the point $\theta$. $p(t|x)$ is called the evidence and plays the role of a
normalizing constant, so we ignore it. That is,

$$\mathrm{Posterior} \propto \mathrm{Likelihood} \times \mathrm{Prior} \qquad (2)$$

We consider a class of neural networks defined by the function

$$y(x,\theta) = b + \sum_{j=1}^{H} v_j \tanh\left(a_j + \sum_{i=1}^{P} u_{ij} x_i\right) \qquad (3)$$

The neural networks have $P$ inputs, a single hidden layer of $H$ hidden nodes
and one output. In the particular BNN described here, each neural network has
the same structure. The parameters $u_{ij}$ and $v_j$ are called the weights,
and $a_j$ and $b$ are called the biases. Both sets of parameters are generally
referred to collectively as the weights of the BNN, $\theta$. $y(x,\theta)$ is
the predicted target value. We assume that the noise on the target values can
be modeled by a Gaussian distribution. So the likelihood of $n$ training events
is



$$p(t|x,\theta) = \prod_{i=1}^{n} \exp\left[-\frac{(t_i - y(x_i,\theta))^2}{2\sigma^2}\right] = \exp\left[-\sum_{i=1}^{n} \frac{(t_i - y(x_i,\theta))^2}{2\sigma^2}\right] \qquad (4)$$



where $t_i$ is the target value and $\sigma$ is the standard deviation of the
noise. It has been assumed that the events are independent of each other. The
likelihood of the predicted target values is then computed from Eq. (4).
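
The network function and the Gaussian likelihood above can be sketched in a few lines of NumPy. This is only an illustration: the function names, array shapes and toy numbers are assumptions made here, the hidden-node activation is taken to be tanh, and the code is not the package used in the paper.

```python
import numpy as np

def network_output(x, u, a, v, b):
    """Eq. (3): x is (n, P), u is (P, H), a and v are (H,), b is a scalar."""
    hidden = np.tanh(a + x @ u)      # (n, H) hidden-node activations
    return b + hidden @ v            # (n,) predicted target values

def log_likelihood(t, x, u, a, v, b, sigma):
    """Logarithm of Eq. (4); normalisation constants are dropped as in the text."""
    residual = t - network_output(x, u, a, v, b)
    return -np.sum(residual ** 2) / (2.0 * sigma ** 2)

rng = np.random.default_rng(0)
P, H, n = 4, 8, 5                    # 4 inputs, 8 hidden nodes, 5 toy events
x = rng.normal(size=(n, P))
u, a = rng.normal(size=(P, H)), rng.normal(size=H)
v, b = rng.normal(size=H), 0.1
t = network_output(x, u, a, v, b)    # toy targets generated without noise

# The log-likelihood vanishes for a perfect fit and is negative otherwise.
print(log_likelihood(t, x, u, a, v, b, sigma=0.05))
```

Working with the logarithm avoids numerical underflow when the product in Eq. (4) runs over thousands of training events.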

Having obtained the likelihood, we still need the prior to compute the
posterior density. The choice of prior is not obvious. However, experience
suggests that a reasonable class is that of Gaussian priors centered at zero,
which prefer smaller rather than larger weights, because smaller weights yield
smoother fits to data. In the paper, a Gaussian prior is specified for each
weight using the Bayesian neural networks package of Radford Neal$^1$. The
variance for weights belonging to a given group (either input-to-hidden weights
($u_{ij}$), hidden biases ($a_j$), hidden-to-output weights ($v_j$) or output
bias ($b$)) is chosen to be the same: $\sigma_u^2$, $\sigma_a^2$, $\sigma_v^2$,
$\sigma_b^2$, respectively. Since we don't know, a priori, what these variances
should be, their values are allowed to vary over a large range, while favoring
small variances. This is done by assigning each variance a gamma prior

$$p(z) = \left(\frac{\alpha}{\mu}\right)^{\alpha} \frac{z^{\alpha-1} e^{-z\alpha/\mu}}{\Gamma(\alpha)} \qquad (5)$$

where $z = \sigma^{-2}$, and with the mean $\mu$ and shape parameter $\alpha$
set to some fixed plausible values. The gamma prior is referred to as a
hyperprior and the parameter of the hyperprior is called a hyperparameter.
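
As a hedged illustration of the hierarchical prior, the sketch below draws a precision $z = \sigma^{-2}$ from a gamma density with a given mean and shape, and then draws one group of weights from the resulting zero-centered Gaussian. The numerical values of the mean and shape, and the helper name `sample_precision`, are assumptions chosen for the example and are not the settings used in the paper.

```python
import numpy as np

def sample_precision(mu, alpha, size, rng):
    """Draw z = 1/sigma^2 from a gamma density with mean mu and shape alpha."""
    # NumPy parametrises the gamma density by shape and scale; mean = shape * scale.
    return rng.gamma(shape=alpha, scale=mu / alpha, size=size)

rng = np.random.default_rng(1)
mu, alpha = 4.0, 2.0                           # illustrative values only
z = sample_precision(mu, alpha, 200_000, rng)
print(z.mean())                                # close to mu

# One hyperparameter draw fixes the common width of a whole weight group.
sigma_u = 1.0 / np.sqrt(sample_precision(mu, alpha, 1, rng)[0])
u = rng.normal(0.0, sigma_u, size=(4, 8))      # input-to-hidden weights, P=4, H=8
```

Sharing one variance per group keeps the number of hyperparameters at four, however many weights the network has.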

The posterior density, $p(\theta|x,t)$, is then obtained from Eqs. (2) and (4)
and the Gaussian prior. Given an event with data $x'$, an estimate of the
target value is given by the weighted average

$$\bar{y}(x'|x,t) = \int y(x',\theta)\, p(\theta|x,t)\, d\theta \qquad (6)$$

Currently, the only way to perform the high-dimensional integral in Eq. (6) is
to sample the density $p(\theta|x,t)$ with the Markov Chain Monte Carlo (MCMC)
method [1, 4, 5, 6]. In the MCMC method, one steps through the $\theta$
parameter space in such a way that points are visited with a probability
proportional to the posterior density, $p(\theta|x,t)$. Points where
$p(\theta|x,t)$ is large will be visited more often than points where
$p(\theta|x,t)$ is small.

The integral in Eq. (6) is approximated by the average

$$\bar{y}(x'|x,t) \approx \frac{1}{L} \sum_{i=1}^{L} y(x',\theta_i) \qquad (7)$$

where $L$ is the number of points $\theta$ sampled from $p(\theta|x,t)$. Each
point $\theta$ corresponds to a different neural network with the same
structure. So the average is an average over neural networks, and it approaches
the true value of $\bar{y}(x'|x,t)$ when $L$ is sufficiently large.
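
The sampling-and-averaging idea of Eq. (7) can be illustrated with a deliberately small example. Everything here is an assumption made for brevity: a one-weight model stands in for the neural network, a plain random-walk Metropolis sampler stands in for the sampler of the package used in the paper, and the noise width, prior width and step size are arbitrary.

```python
import numpy as np

def model(x, theta):
    """Stand-in for y(x, theta): a single weight, y = theta * x."""
    return theta * x

def log_posterior(theta, x, t, sigma=0.1, prior_width=10.0):
    """Log of likelihood times prior, up to an additive constant."""
    log_like = -np.sum((t - model(x, theta)) ** 2) / (2.0 * sigma ** 2)
    log_prior = -theta ** 2 / (2.0 * prior_width ** 2)
    return log_like + log_prior

def metropolis(x, t, n_iter, steps_per_iter, step, rng):
    """Random-walk Metropolis; one point is stored at the end of each iteration."""
    theta, logp = 0.0, log_posterior(0.0, x, t)
    stored = []
    for _ in range(n_iter):
        for _ in range(steps_per_iter):
            proposal = theta + step * rng.normal()
            logp_new = log_posterior(proposal, x, t)
            # Accept with probability min(1, posterior ratio).
            if np.log(rng.uniform()) < logp_new - logp:
                theta, logp = proposal, logp_new
        stored.append(theta)
    return np.array(stored)

rng = np.random.default_rng(2)
x = rng.uniform(-1.0, 1.0, size=50)
t = 2.0 * x + rng.normal(0.0, 0.1, size=50)    # toy data with true weight 2

chain = metropolis(x, t, n_iter=500, steps_per_iter=20, step=0.05, rng=rng)
kept = chain[150:]                              # drop the start of the chain
x_new = 0.5
y_bar = np.mean([model(x_new, th) for th in kept])   # Eq. (7) with L = len(kept)
print(y_bar)
```

Each stored value of the weight plays the role of one neural network, and the prediction is the plain mean of the individual predictions.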



1 R. M. Neal, Software for Flexible Bayesian Modeling and Markov Chain Sampling,
http://www.cs.utoronto.ca/~radford/fbm.software.html



3 Toy Detector and Simulation 



3.1 Toy Detector 

In the paper, a toy detector is designed with the CERN GEANT4 package [9] to
simulate the central detectors of reactor neutrino experiments, such as the
Daya Bay experiment [7] and the Double Chooz experiment [8]. The toy detector
consists of three regions: the Gd-doped liquid scintillator (Gd-LS from now
on), the normal liquid scintillator (LS from now on) and the oil buffer. The
toy detector is cylindrical, like the detector modules of the Daya Bay and
Double Chooz experiments. The diameter of the Gd-LS region is 2.4 meters, and
its height is 2.6 meters. The thickness of the LS region is 0.35 meters, and
the thickness of the oil region is 0.40 meters. In the paper, the Gd-LS and LS
are the same as the scintillator adopted in the proposal of the CHOOZ
experiment [10]. The 8-inch photomultiplier tubes (PMT from now on) are mounted
on the inside of the oil region of the detector. A total of 366 PMTs are
arranged in 8 rings of 30 PMTs on the lateral surface of the oil region, and in
5 rings of 24, 18, 12, 6 and 3 PMTs on each of the top and bottom caps.
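
As a quick consistency check of the PMT layout, the count can be reproduced with a few lines of arithmetic. The variable names are illustrative, and the cap rings are assumed to be repeated on both the top and the bottom cap, which is what makes the total come out to 366.

```python
barrel_rings, pmts_per_barrel_ring = 8, 30
cap_rings = [24, 18, 12, 6, 3]               # PMTs per ring on one end cap

n_barrel = barrel_rings * pmts_per_barrel_ring
n_caps = 2 * sum(cap_rings)                  # top and bottom caps together
print(n_barrel, n_caps, n_barrel + n_caps)   # 240 126 366
```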

3.2 Monte-Carlo Simulation of Toy Detector 

The detector response to electron events in the toy detector is simulated with
GEANT4. Although the physical properties of the scintillator and the oil (their
optical attenuation length, refractive index and so on) are wavelength
dependent, only averages [10] are used in the detector simulation (for example,
a uniform optical attenuation length of 8 meters for Gd-LS and 20 meters for
LS). The program therefore does not reproduce the real detector response
exactly, but this is not expected to affect the comparison between BNN and MLD.
The program allows us to simulate the detector response for electron events of
different energies and vertices. In the paper, 10000 electron events, used as
the training sample, are generated uniformly throughout the Gd-LS region, with
energies generated uniformly from 1 MeV to 13 MeV. 3000 electron events, used
as the 1 MeV test sample, are generated uniformly throughout the Gd-LS region.
The test samples from 2 MeV to 8 MeV are generated in the same way.
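
The generation of vertices and energies described above can be sketched as follows. This covers only the sampling step, not the GEANT4 detector response. The function name, the choice of origin at the center of the Gd-LS cylinder and the z axis along the cylinder axis are assumptions made for the illustration.

```python
import numpy as np

def generate_events(n, rng, radius=1.2, height=2.6, e_min=1.0, e_max=13.0):
    """Uniform vertices in a cylinder (meters) and uniform energies (MeV)."""
    r = radius * np.sqrt(rng.uniform(size=n))    # sqrt makes the density uniform in area
    phi = rng.uniform(0.0, 2.0 * np.pi, size=n)
    z = rng.uniform(-height / 2.0, height / 2.0, size=n)
    energy = rng.uniform(e_min, e_max, size=n)
    xyz = np.column_stack([r * np.cos(phi), r * np.sin(phi), z])
    return xyz, energy

rng = np.random.default_rng(3)
train_xyz, train_e = generate_events(10_000, rng)   # training sample, 1 to 13 MeV
test_xyz, _ = generate_events(3_000, rng)           # vertices of one test sample
test_e = np.full(3_000, 1.0)                        # mono-energetic 1 MeV sample
```

Drawing the radius as the square root of a uniform number is what keeps the vertex density constant over the cross-section. Drawing the radius uniformly would pile events up near the axis.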

4 Event Reconstruction 

The task of event reconstruction in reactor neutrino experiments is to
reconstruct the energy and the vertex of a signal. The maximum likelihood
method is the standard algorithm for event reconstruction in these experiments.
The likelihood is defined as the joint Poisson probability of observing the
measured distribution of photoelectrons over all the PMTs for given
$(E, \vec{x})$ coordinates in the detector. Ref. [11], from the CHOOZ
experiment, describes the reconstruction method in detail. The BNN algorithm is
also applied to the event reconstruction, and its result is compared with that
of MLD.

4.1 Event Reconstruction with MLD 

In the paper, the event reconstruction with MLD is performed in a similar way
to that of the CHOOZ experiment [11]. Because the detector differs from the
CHOOZ detector, the procedure differs from Ref. [11] in the following points:

(1) The detector in the paper consists of three regions, so the path length
from a signal vertex to a PMT consists of three parts: the path length in the
Gd-LS region, the one in the LS region, and the one in the oil region.

(2) Since not all PMTs in the detector receive photoelectrons when an electron
deposits its energy in the detector, the $\chi^2$ equation is modified in the
paper and differs from the one in the CHOOZ experiment, that is,
$\chi^2 = \sum_{N_j = 0} \bar{N}_j + \sum_{N_j \neq 0} \left(\bar{N}_j - N_j + N_j \log\frac{N_j}{\bar{N}_j}\right)$,
where $N_j$ is the number of photoelectrons received by the j-th PMT and
$\bar{N}_j$ is the expected one for the j-th PMT [11].

(3) $c_E \times N_{total}$ and the coordinates of the charge center of gravity
of all the visible photoelectrons from a signal are taken as the starting
values for the fit parameters $(E, \vec{x})$, where $N_{total}$ is the total
number of visible photoelectrons from a signal and $c_E$ is the proportionality
constant of the energy $E$, that is, $E = c_E \times N_{total}$. $c_E$ is
obtained by fitting the $N_{total}$'s of the 1 MeV electron events, and is
235/MeV in the paper.

$(E, \vec{x})$ of all the electron events, including the test samples and the
training sample, are reconstructed using MLD.
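
Points (2) and (3) above can be sketched as two small functions: the modified $\chi^2$, which treats PMTs with and without photoelectrons separately, and the charge center of gravity used as the starting vertex. The function names and the three-PMT toy input are assumptions for illustration. The expected photoelectron numbers would in practice come from the optical model of the detector, which is not reproduced here.

```python
import numpy as np

def poisson_chi2(n_obs, n_exp):
    """Modified chi-square: observed (n_obs) and expected (n_exp) photoelectrons per PMT."""
    n_obs = np.asarray(n_obs, dtype=float)
    n_exp = np.asarray(n_exp, dtype=float)
    hit = n_obs > 0
    chi2 = np.sum(n_exp[~hit])                   # PMTs that saw no photoelectrons
    chi2 += np.sum(n_exp[hit] - n_obs[hit]
                   + n_obs[hit] * np.log(n_obs[hit] / n_exp[hit]))
    return chi2

def charge_center(n_obs, pmt_xyz):
    """Charge-weighted mean PMT position, a starting value for the vertex fit."""
    n_obs = np.asarray(n_obs, dtype=float)
    pmt_xyz = np.asarray(pmt_xyz, dtype=float)
    return (n_obs[:, None] * pmt_xyz).sum(axis=0) / n_obs.sum()

n_obs = [0, 2, 5]                                # toy photoelectron counts
n_exp = [0.5, 2.0, 5.0]                          # toy expectations
pmt_xyz = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
print(poisson_chi2(n_obs, n_exp))                # 0.5
print(charge_center(n_obs, pmt_xyz))             # (2/7, 5/7, 0)
```

Splitting the sum avoids evaluating the logarithm at zero for PMTs that saw no light, for which only the expected count contributes.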

4.2 Event Reconstruction with BNN 

In the paper, the Cartesian coordinates $(x, y, z)$ of all the events,
including the test samples and the training sample, are transformed to
cylindrical coordinates $(r, \theta, z)$. $(E, r, \theta, z)$ are used as
inputs to the BNN, which has an input layer of 4 inputs, a single hidden layer
of 8 nodes and an output layer with one output, which is $E$, $x$, $y$ or $z$,
respectively. $E$ and $x$, $y$, $z$ of the test samples are each predicted
using the BNN. In the process of the event reconstruction, a Markov chain of
neural networks is generated from the training sample using the Bayesian neural
networks package of Radford Neal. One thousand iterations, of twenty MCMC steps
each, are used in the paper. Because the correlation between adjacent steps is
very high, the neural network parameters are stored only once per iteration.
That is, the points in the neural network parameter space are saved every
twenty steps to lessen the correlation. It is also necessary to discard the
initial part of the Markov chain, because the points in this part are strongly
correlated with the initial point of the chain. The initial three hundred
iterations are discarded in the paper.
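
The coordinate transformation and the bookkeeping of the chain can be sketched as below. The function name is illustrative, the angle convention is the one of `np.arctan2`, and the number of networks entering the average is computed under the assumption that every stored network after the discarded part is used.

```python
import numpy as np

def to_cylindrical(xyz):
    """Convert rows of (x, y, z) to (r, theta, z), with theta from np.arctan2."""
    x, y, z = np.asarray(xyz, dtype=float).T
    return np.column_stack([np.hypot(x, y), np.arctan2(y, x), z])

n_iterations, steps_per_iteration, n_discarded = 1000, 20, 300
total_steps = n_iterations * steps_per_iteration   # MCMC steps in the whole chain
n_networks = n_iterations - n_discarded            # stored networks left after discarding

print(to_cylindrical([[0.0, 1.0, 0.5]]))           # r = 1, theta = pi/2, z = 0.5
print(total_steps, n_networks)                     # 20000 700
```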

5 Conclusion 

Fig. 1, Fig. 2 and Fig. 3 illustrate the results of the event reconstruction
with BNN and MLD. Fig. 1 shows that the vertex errors of 1 MeV and 8 MeV
electrons reconstructed by BNN are consistent with those obtained by MLD, that
is, they are not obviously different. Fig. 2 shows that the energy uncertainty
for 1 MeV electrons with BNN decreases by 95.0% in comparison with the one
using MLD, and that the uncertainty for the 8 MeV events decreases by 76.3%.
Fig. 3 shows that the improvement in energy resolution of BNN over MLD grows
with increasing energy. Meanwhile, the relative errors of the energy
resolutions are about 2.0%, and come from fit errors (about 1.5%) and
statistical errors (about 1.3%). So the difference between the results of BNN
and MLD is not significant in the case of 1 MeV events once statistical
fluctuations are taken into account, and the difference comes mainly from the
superiority of BNN for the events from 2 MeV to 8 MeV. Thus the energy
resolutions using BNN are significantly improved for high-energy events in
comparison with those using MLD. Therefore, BNN can be well applied to the
energy reconstruction in reactor neutrino experiments, where it yields a better
energy resolution.
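
The quoted total relative error of about 2.0% is consistent with adding the fit and statistical errors in quadrature. The text does not state how the two were combined, so the quadrature sum, which assumes the two errors are independent, is an assumption of this short check.

```python
import math

fit_error, stat_error = 1.5, 1.3             # relative errors in percent, from the text
total = math.hypot(fit_error, stat_error)    # quadrature sum, assumes independence
print(round(total, 2))                       # 1.98
```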

Although the discussion in the paper concerns only reactor neutrino
experiments, it is expected that the BNN algorithm can also be applied to the
event reconstruction of other experiments and will find wide application in
high energy physics experiments.

Fig. 1: $\delta x$, $\delta y$, $\delta z$ are the differences between the
coordinates of the reconstructed position and the generated ones. The event
position is reconstructed using BNN and MLD, respectively. (a), (c), (e)
illustrate the difference distributions of the 1 MeV electrons, and (b), (d),
(f) illustrate those of the 8 MeV electrons.

Fig. 2: The energies of 1 MeV and 8 MeV electrons are reconstructed using BNN
and MLD, respectively. (a), (b) illustrate the distribution of the energy
reconstructed by BNN for 1 MeV and 8 MeV electrons, respectively. (c), (d)
illustrate the distribution of the energy reconstructed by MLD for 1 MeV and 8
MeV electrons, respectively.

Fig. 3: The energy resolutions of the test samples from 1 MeV to 8 MeV are
shown in the figure. The white squares denote the resolutions using MLD, and
the black squares denote those using BNN.



